Sleep Quality Estimation using Machine Learning and Thigh-Mounted Accelerometer Data in Free-Living Conditions

(working title)

Esben Lykke, PhD student

06 February, 2023

Background

  • Sleep plays a vital role in health, so improving the assessment of sleep–wake outside the laboratory is critical
  • The gold standard, polysomnography (PSG), is costly and inconvenient
  • Methods for estimating sleep/wake from accelerometry exist, primarily for wrist-worn devices
  • The Cole-Kripke and Sadeh algorithms are commonly used
  • Determining in-bed time is difficult; it is usually set by sleep logs and/or human scorers
  • Detecting wakefulness is difficult, and performance is worse in populations with sleep disorders
  • Typically a two-level analysis: epoch-based and summarized across night(s)
  • Zmachine-derived sleep stats

Purpose…

But Esben, what about them sleep stages!?

  • I did free-living PSG recordings of sleep but…
    • Super fragile -> shitty data
    • Cumbersome and time-consuming
    • Free-living?
    • Would surface skin temperature + acc be enough? Most likely needs HR

It was likely a dead end from the get-go :(

Methods

Data preparation: the big time-consumer is handling the raw acc data.

only thigh data used. HSBC and other is only thigh data…

It could be interesting to build models on thigh and hip data combined.

The entire Zmachine (zm) recording is considered in-bed time (sensor problem?).

No sleep stages, only sleep/awake.

Sensor problems during sleep: up to 20 consecutive epochs (200 sec) are treated as sleep.
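The gap rule for sensor problems can be sketched as below. This is a minimal illustration, not the pipeline actually used: the label strings, the function name and the requirement that a gap be flanked by sleep on both sides (one reading of "during sleep") are all assumptions. Runs longer than 20 epochs are simply left untouched here, since the notes do not say how they are handled.

```python
from itertools import groupby


def fill_sensor_gaps(labels, max_run=20, problem="problem", sleep="sleep"):
    """Relabel short runs of sensor-problem epochs as sleep.

    A run is relabelled only if it is at most `max_run` epochs long and
    is flanked by sleep epochs on both sides (assumed reading of
    "during sleep"). All other epochs are returned unchanged.
    """
    # Collapse the sequence into (label, run_length) pairs.
    runs = [(label, len(list(group))) for label, group in groupby(labels)]
    filled = []
    for i, (label, length) in enumerate(runs):
        prev_label = runs[i - 1][0] if i > 0 else None
        next_label = runs[i + 1][0] if i + 1 < len(runs) else None
        is_short_gap = label == problem and length <= max_run
        if is_short_gap and prev_label == sleep and next_label == sleep:
            label = sleep
        filled.extend([label] * length)
    return filled


# Hypothetical night: a 2-epoch gap (filled) and a 25-epoch gap (kept).
example = (["sleep"] * 3 + ["problem"] * 2 + ["sleep"] * 2
           + ["problem"] * 25 + ["sleep"])
cleaned = fill_sensor_gaps(example)
```

With 10-second epochs, `max_run=20` corresponds to the 200 seconds stated above.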

Exclusion Criteria

Features

Basic Features

  • Weekday
  • Time of Day
  • Placement
  • Temperature

ACC-Derived Features

  • Mean ACC X
  • Mean ACC Y
  • Mean ACC Z
  • Standard Deviation X
  • Standard Deviation Y
  • Standard Deviation Z
  • Max Standard Deviation
  • Inclination
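A rough sketch of how these per-epoch features could be computed from raw samples. The sampling rate (`fs=30`), the function name and the dictionary keys are placeholders, the 10-second epoch is inferred from "20 consecutive epochs (200 sec)", and inclination is assumed to be the angle between the sensor x-axis and gravity; the actual definitions may differ.

```python
import math
import statistics


def epoch_features(samples, fs=30, epoch_sec=10):
    """Summarise raw (x, y, z) samples in g into one feature dict per epoch."""
    n = fs * epoch_sec  # samples per epoch
    features = []
    # Step through complete epochs only; a trailing partial epoch is dropped.
    for start in range(0, len(samples) - n + 1, n):
        x, y, z = zip(*samples[start:start + n])
        means = [statistics.fmean(axis) for axis in (x, y, z)]
        sds = [statistics.stdev(axis) for axis in (x, y, z)]
        norm = math.sqrt(sum(m * m for m in means))
        # Assumed definition: angle between the sensor x-axis and gravity.
        cos_inc = means[0] / norm if norm > 0 else 0.0
        inclination = math.degrees(math.acos(max(-1.0, min(1.0, cos_inc))))
        features.append({
            "mean_x": means[0], "mean_y": means[1], "mean_z": means[2],
            "sd_x": sds[0], "sd_y": sds[1], "sd_z": sds[2],
            "sd_max": max(sds),
            "inclination": inclination,
        })
    return features


# Synthetic check: a motionless sensor with gravity along z, 20 seconds long.
still = [(0.0, 0.0, 1.0)] * 600
features = epoch_features(still)
```

For the motionless example the x-axis is perpendicular to gravity, so the inclination is 90 degrees and all standard deviations are zero.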

Sensor-Independent Features

  • Clock Proxy Linear
  • Clock Proxy Cosine

Human Circadian Clock

Forger, Jewett, and Kronauer (1999): a so-called cubic van der Pol equation

\[\frac{dx_c}{dt}=\frac{\pi}{12}\left\{\mu\left(x_c-\frac{4x_c^3}{3}\right)-x\left[\left(\frac{24}{0.99669\,\tau_x}\right)^2+kB\right]\right\}\]

This thing is dependent on ambient light and body temperature!

Walch et al. (2019) incorporated this feature using step counts from the Apple Watch

But as demonstrated by Walch et al. (2019), a simple cosine function does the trick just as well :)
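One way the two clock proxies could look in code. Both definitions are guesses for illustration: the cosine is taken to have a 24-hour period with its minimum at a hypothetical `trough_hour`, and the linear proxy is taken to be a ramp from 0 to 1 across the recording. Neither is confirmed as the parameterisation used here or by Walch et al. (2019).

```python
import math


def clock_proxy_cosine(hour_of_day, trough_hour=4.0):
    """Cosine with a 24 h period, lowest (-1) at `trough_hour` (hypothetical)."""
    return -math.cos(2 * math.pi * (hour_of_day - trough_hour) / 24)


def clock_proxy_linear(epoch_index, n_epochs):
    """Linear ramp from 0 at the first epoch to 1 at the last (assumed form)."""
    if n_epochs < 2:
        return 0.0
    return epoch_index / (n_epochs - 1)


# Cosine proxy sampled every 6 hours, starting at the assumed trough.
cosine_values = [clock_proxy_cosine(h) for h in (4.0, 10.0, 16.0, 22.0)]
```

Neither feature depends on sensor readings, only on the clock, which is what makes them sensor-independent.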

Circadian Proxy Features

Building Models

Results

  • Performance Metrics
    • F1 Score
    • Accuracy
    • Sensitivity
    • Specificity
    • ROC curves
  • Agreement With Zmachine Sleep Stats
    • Sleep Period Time
    • Total Sleep Time
    • Sleep Efficiency
    • Latency Until Persistent Sleep
    • Wake After Sleep Onset
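The night-level sleep stats can be derived from epoch-level predictions roughly as follows. The definitions are common conventions assumed for this sketch and may not match the Zmachine's own: sleep period time runs from the first to the last sleep epoch, sleep efficiency is total sleep time over time in bed, and persistent sleep is taken to mean 10 consecutive minutes of sleep. The function name and dictionary keys are hypothetical.

```python
def sleep_stats(asleep, epoch_sec=10, persistent_min=10):
    """Summarise one night from a list of booleans (True = asleep),
    one per in-bed epoch. All durations are returned in minutes."""
    def minutes(n_epochs):
        return n_epochs * epoch_sec / 60

    sleep_idx = [i for i, s in enumerate(asleep) if s]
    if not sleep_idx:
        return {"spt": 0.0, "tst": 0.0, "se": 0.0, "lps": None, "waso": 0.0}

    onset, final = sleep_idx[0], sleep_idx[-1]
    spt_epochs = final - onset + 1          # sleep onset to final awakening
    tst_epochs = len(sleep_idx)             # all epochs scored as sleep

    # Latency until persistent sleep: epochs before the first unbroken
    # run of sleep lasting at least `persistent_min` minutes.
    need = max(1, round(persistent_min * 60 / epoch_sec))
    lps_epochs, run = None, 0
    for i, s in enumerate(asleep):
        run = run + 1 if s else 0
        if run == need:
            lps_epochs = i - need + 1
            break

    return {
        "spt": minutes(spt_epochs),
        "tst": minutes(tst_epochs),
        "se": 100 * tst_epochs / len(asleep),
        "lps": None if lps_epochs is None else minutes(lps_epochs),
        "waso": minutes(spt_epochs - tst_epochs),
    }


# Made-up night: 1 min awake, 10 min asleep, 1 min awake, 10 min asleep, 1 min awake.
night = [False] * 6 + [True] * 60 + [False] * 6 + [True] * 60 + [False] * 6
stats = sleep_stats(night)
```

Computing the same summary from both the model predictions and the Zmachine labels gives the paired values needed for the agreement analysis.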

ROC Curves

Performance Metrics, Grouped by Event Prediction

                    Logistic Regression   Neural Network   Decision Tree   XGBoost
In-bed Prediction
  F1 Score          90.88%                93.69%           93.37%          93.77%
  Accuracy          92.87%                94.81%           94.46%          94.85%
  Sensitivity       85.43%                92.64%           93.83%          93.16%
  Precision         97.07%                94.75%           92.92%          94.39%
  Specificity       98.17%                96.35%           94.91%          96.06%
Sleep Prediction
  F1 Score          86.57%                89.59%           89.34%          89.62%
  Accuracy          90.77%                92.41%           92.10%          92.39%
  Sensitivity       84.65%                92.95%           94.20%          93.49%
  Precision         88.59%                86.47%           84.96%          86.06%
  Specificity       94.09%                92.12%           90.96%          91.79%

Performance of the models in predicting each class separately, i.e., “sleep” and “in-bed”.
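For reference, the five metrics reported above follow from the epoch-level confusion matrix. The sketch below uses the standard definitions; the function names and the toy labels are illustrative only and are not taken from the analysis code.

```python
def binary_metrics(y_true, y_pred):
    """Compute metrics for 0/1 labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)   # recall of the positive class
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "f1": f1_from(precision, sensitivity),
    }


def f1_from(precision, sensitivity):
    """F1 is the harmonic mean of precision and sensitivity."""
    total = precision + sensitivity
    return 2 * precision * sensitivity / total if total else float("nan")


# Toy labels (not study data): 3 TP, 3 TN, 1 FP, 1 FN.
toy = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])

# Consistency check against the logistic regression in-bed row.
f1_logreg_in_bed = f1_from(0.9707, 0.8543)
```

The check on the logistic regression in-bed row (precision 97.07%, sensitivity 85.43%) reproduces the tabulated F1 score of 90.88%.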

Performance Metrics, Grouped by Event Prediction

                    Logistic Regression   Neural Network   Decision Tree   XGBoost
In-Bed Awake Prediction
  F1 Score          15.88%                25.45%           26.41%          27.54%
  Accuracy          92.05%                92.95%           93.04%          93.26%
  Sensitivity       11.67%                18.73%           19.44%          19.93%
  Precision         24.83%                39.69%           41.18%          44.58%
  Specificity       97.57%                98.05%           98.09%          98.30%
In-Bed Sleep Prediction
  F1 Score          86.56%                89.54%           89.35%          89.61%
  Accuracy          90.76%                92.39%           92.11%          92.38%
  Sensitivity       84.61%                92.69%           94.18%          93.45%
  Precision         88.60%                86.60%           84.99%          86.07%
  Specificity       94.10%                92.23%           90.98%          91.80%

Performance of the models in predicting each combined class, i.e., in-bed awake and in-bed asleep.

Bland-Altman Plots
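The quantities behind a Bland-Altman plot are the mean difference (bias) and the 95% limits of agreement. A minimal sketch follows, with made-up total sleep time values in minutes standing in for the Zmachine reference and a model estimate; the variable and function names are hypothetical.

```python
import statistics


def bland_altman(reference, estimate):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits of agreement are bias +/- 1.96 * SD of the differences,
    which assumes the differences are roughly normal with constant spread.
    """
    diffs = [e - r for r, e in zip(reference, estimate)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Illustrative numbers only, not study data.
zm_tst = [400, 420, 380, 450]
model_tst = [410, 415, 390, 455]
bias, lower, upper = bland_altman(zm_tst, model_tst)
```

If the spread of the differences grows with the magnitude of the measurement (heteroscedasticity), fixed limits like these can be misleading, and the differences are often modelled as a function of the mean instead.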

In-bed classification flow

Sleep classification flow

Discussion

  • Heteroscedasticity
  • Cheung 2018, table 4: actigraphy provides a sufficiently narrow range of possible mean differences (95% CI) relative to clinically significant thresholds

References

Forger, D. B., M. E. Jewett, and R. E. Kronauer. 1999. “A Simpler Model of the Human Circadian Pacemaker.” Journal of Biological Rhythms 14 (6): 532–37. https://doi.org/10.1177/074873099129000867.
Hirshkowitz, Max, Kaitlyn Whiton, Steven M Albert, Cathy Alessi, Oliviero Bruni, Lydia DonCarlos, Nancy Hazen, et al. 2015. “National Sleep Foundation’s Sleep Time Duration Recommendations: Methodology and Results Summary.” Sleep Health, 4.
Skotte, Jørgen, Mette Korshøj, Jesper Kristiansen, Christiana Hanisch, and Andreas Holtermann. 2014. “Detection of Physical Activity Types Using Triaxial Accelerometers.” Journal of Physical Activity and Health 11 (1): 76–84. https://doi.org/10.1123/jpah.2011-0347.
Walch, Olivia, Yitong Huang, Daniel Forger, and Cathy Goldstein. 2019. “Sleep Stage Prediction with Raw Acceleration and Photoplethysmography Heart Rate Data Derived from a Consumer Wearable Device.” Sleep 42 (12): zsz180. https://doi.org/10.1093/sleep/zsz180.